适当的评估和实验设计对于经验科学是基础,尤其是在数据驱动领域。例如,由于语言的计算建模成功,研究成果对最终用户产生了越来越直接的影响。随着最终用户采用差距的减少,需求增加了,以确保研究社区和从业者开发的工具和模型可靠,可信赖,并且支持用户的目标。在该立场论文中,我们专注于评估视觉文本分析方法的问题。我们从可视化和自然语言处理社区中采用跨学科的角度,因为我们认为,视觉文本分析的设计和验证包括超越计算或视觉/交互方法的问题。我们确定了四个关键的挑战群,用于评估视觉文本分析方法(数据歧义,实验设计,用户信任和“大局”问题),并从跨学科的角度为研究机会提供建议。
translated by 谷歌翻译
Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARSCoV2) infection status. Here, we undertake a large scale study of audio-based deep learning classifiers, as part of the UK governments pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata, including reverse transcription polymerase chain reaction (PCR) test outcomes, of whom 23,514 tested positive for SARS CoV 2. Subjects were recruited via the UK governments National Health Service Test-and-Trace programme and the REal-time Assessment of Community Transmission (REACT) randomised surveillance survey. In an unadjusted analysis of our dataset AI classifiers predict SARS-CoV-2 infection status with high accuracy (Receiver Operating Characteristic Area Under the Curve (ROCAUC) 0.846 [0.838, 0.854]) consistent with the findings of previous studies. However, after matching on measured confounders, such as age, gender, and self reported symptoms, our classifiers performance is much weaker (ROC-AUC 0.619 [0.594, 0.644]). Upon quantifying the utility of audio based classifiers in practical settings, we find them to be outperformed by simple predictive scores based on user reported symptoms.
translated by 谷歌翻译
Since early in the coronavirus disease 2019 (COVID-19) pandemic, there has been interest in using artificial intelligence methods to predict COVID-19 infection status based on vocal audio signals, for example cough recordings. However, existing studies have limitations in terms of data collection and of the assessment of the performances of the proposed predictive models. This paper rigorously assesses state-of-the-art machine learning techniques used to predict COVID-19 infection status based on vocal audio signals, using a dataset collected by the UK Health Security Agency. This dataset includes acoustic recordings and extensive study participant meta-data. We provide guidelines on testing the performance of methods to classify COVID-19 infection status based on acoustic features and we discuss how these can be extended more generally to the development and assessment of predictive methods based on public health datasets.
translated by 谷歌翻译
The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% participants reporting asthma, and 27.20% with linked influenza PCR test results.
translated by 谷歌翻译
生成的对抗网络(GAN)是在众多领域成功使用的一种强大的深度学习模型。它们属于一个称为生成方法的更广泛的家族,该家族通过从真实示例中学习样本分布来生成新数据。在临床背景下,与传统的生成方法相比,GAN在捕获空间复杂,非线性和潜在微妙的疾病作用方面表现出增强的能力。这篇综述评估了有关gan在各种神经系统疾病的成像研究中的应用的现有文献,包括阿尔茨海默氏病,脑肿瘤,脑老化和多发性硬化症。我们为每个应用程序提供了各种GAN方法的直观解释,并进一步讨论了在神经影像学中利用gans的主要挑战,开放问题以及有希望的未来方向。我们旨在通过强调如何利用gan来支持临床决策,并有助于更好地理解脑部疾病的结构和功能模式,从而弥合先进的深度学习方法和神经病学研究之间的差距。
translated by 谷歌翻译
在本文中,我们提出了一个基于最新优化算法的新的深度展开的神经网络,以进行分析压缩传感。所提出的称为解码网络(DECONET)的网络实现了一种解码器,该解码器从其不完整的嘈杂测量值中重建向量。此外,DeConet共同学习了一个冗余的分析操作员,以进行稀疏,该操作员在整个Deconet层共享。我们研究Deconet的概括能力。为此,我们首先估计假设类别的Rademacher复杂性,该假设类别由Deconet可以实施的所有解码器组成。然后,我们根据上述估计值提供概括误差界限。最后,我们提出了数值实验,以证实我们理论结果的有效性。
translated by 谷歌翻译